[57] ABSTRACT

A communication device, an apparatus, and a method 
for acoustic echo cancellation which makes use of a 
pseudonoise signal. An audio mixer adds the pseudo- 
noise signal to an input signal received from another 
communication device to produce a first composite 
signal. An audio system converts the first composite 
signal to sound in an at least partially enclosed space. 
The at least partially enclosed space produces an acous- 
tical echo in response. The audio system then converts 
the acoustical echo and other sounds in the at least 
partially enclosed space to a second composite signal. A 
signal processor cross-correlates the second composite 
signal with the pseudonoise signal to produce an esti- 
mate of the overall impulse response of the combined 
system formed by the at least partially enclosed space 
and the audio system. The processor then convolves the 
first composite signal with the impulse response esti- 
mate to produce an echo estimation signal. The echo 
estimation signal is an estimate of the component of the 
second composite signal which corresponds to the 
acoustical echo. The processor then subtracts the echo 
estimation signal from the second composite signal to 
produce an output signal. 

24 Claims, 5 Drawing Sheets 
FIGURE 1
FIGURE 2
FIGURE 3

COMMUNICATION DEVICE, APPARATUS, AND
METHOD UTILIZING PSEUDONOISE SIGNAL
FOR ACOUSTICAL ECHO CANCELLATION

The United States Government has certain rights in 
this invention pursuant to Contract No. ITA 87-02 be- 
tween the U.S. Department of Commerce and Iowa 
State University. 

FIELD OF THE INVENTION

The present invention relates generally to apparatuses and methods
for canceling an acoustical echo that arises in an at least partially
enclosed space and is detected by the audio system of a communication
device. In particular, it pertains to an apparatus and a method which
cancel the acoustical echo by utilizing a pseudorandom noise signal
to estimate the overall impulse response of the combined system
formed by the space and the audio system.

BACKGROUND OF THE INVENTION 

Many types of communication devices exist which allow for hands-free
communication between two parties in separate rooms. Such devices
include speakerphones, public address systems for auditoriums or
meeting rooms, and audio/visual equipment for video classrooms.
Furthermore, new technology is being rapidly developed which will
make communication devices for audio/visual teleconferencing
practical.

The rooms used for this type of communication are typically plagued
by acoustical echoes (i.e., acoustical reverberations). These
acoustical echoes arise when the far-end communication device
provides the near-end communication device with a far-end output
audio signal. This signal is then converted to sound by the audio
system of the near-end communication device. In response, an
acoustical echo is produced within the room. The echo along with the
near-end user’s speech is converted to a near-end audio signal by the
near-end audio system. The near-end audio signal is then transmitted
to the far-end communication device as the near-end output audio
signal. When this signal is converted to sound by the audio system of
the far-end communication device, the far-end user will have
difficulty sorting out the near-end speech from the acoustical echo.

A current approach to eliminating the acoustical echo is to use a
discrete-time linear adaptive filter. Such an adaptive filter is used
to estimate the overall impulse response of the combined system
formed by the room and the near-end audio system. From this estimate,
the adaptive filter generates an estimation signal which estimates
the component of the near-end audio signal produced by the near-end
audio system which corresponds to the acoustical echo in the room.
The estimation signal is then subtracted from the audio signal to
produce the near-end output audio signal.
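
The conventional adaptive filter approach described above can be sketched in Python with NumPy. The text only says "discrete-time linear adaptive filter"; the normalized least-mean-squares (NLMS) update used here is an assumed, commonly used choice, and the function name, step size `mu`, and regularizer `eps` are hypothetical values picked for illustration.

```python
import numpy as np


def nlms_echo_canceller(far_end, mic, n_taps=64, mu=0.5, eps=1e-8):
    """Illustrative NLMS echo canceller (an assumed algorithm, not from the source).

    far_end: samples sent to the loudspeaker.
    mic:     samples picked up by the microphone (echo plus other sounds).
    Returns the echo-reduced signal and the final impulse response estimate.
    """
    w = np.zeros(n_taps)       # current estimate of the overall impulse response
    x_buf = np.zeros(n_taps)   # most recent far-end samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_est = w @ x_buf               # estimation signal for this sample
        err = mic[n] - echo_est            # echo estimate subtracted from mic signal
        out[n] = err
        # Normalized update: step scaled by the energy in the sample buffer.
        w += mu * err * x_buf / (x_buf @ x_buf + eps)
    return out, w
```

Because the update is driven by the residual `err`, any near-end speech in `mic` perturbs the estimate, which is the weakness the following paragraphs discuss.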

A major problem associated with this approach is that the convergence
time for estimating the overall impulse response of the room and
audio system together may be much longer than the stationary period
of the overall impulse response. As a result, changes in the room
characteristics will lead to serious degradation of the performance
of the adaptive filter because it cannot adapt rapidly enough. Such
changes may include doors being opened or closed, movement of
furniture or people, or changes in the direction of the microphone of
the audio system.

Another flaw associated with this approach is that the
presence of near-end speech is not readily handled by 
the adaptive filter. When near-end speech is added to 
the return path, it is suppressed by the adaptive filter. In 
order to alleviate this problem, conventional adaptive 
filter echo cancelers employ near-end speech detectors. 
These detectors are used to detect large near-end 
speech energy so that the adaptive filter computations 
can be suspended during the time interval of the near- 
end speech. This means that echo canceling is sus- 
pended during near-end speech. One undesirable result 
of this is simplex or one-way conversations. A second 
undesirable result is the inability of the adaptive filter to 
adapt to room changes during the time interval of the 
near-end speech. 

OBJECTS OF THE INVENTION 

It is therefore an object of the invention to provide a 
communication device, an apparatus, and a method for 
acoustical echo cancellation which can rapidly adapt to 
changes in room characteristics. 

It is another object of the invention to provide a 
communication device, an apparatus, and a method for 
acoustical echo cancellation which does not suppress 
near-end speech. 

It is still another object of the invention to provide a
communication device, an apparatus, and a method for acoustical echo
cancellation which allows for full-duplex, hands-free, two-way
conversation.

SUMMARY OF THE INVENTION 

The foregoing and other objects of the invention are 
achieved by a communication device, an apparatus, and 
a method for acoustic echo cancellation which makes 
use of an acoustic pseudonoise signal. The communica- 
tion device includes the echo cancellation apparatus 
and an audio system. The echo cancellation apparatus 
includes a pseudonoise signal generator, an audio mixer, 
an analog-to-digital converter, a digital-to-analog con- 
verter, and a digital signal processor. 

The audio mixer adds the pseudonoise signal to an 
input signal received from another communication de- 
vice to produce a first composite signal. The audio 
system converts the first composite signal to sound in an 
at least partially enclosed space. The at least partially 
enclosed space produces an acoustical echo in response. 
The audio system then converts the acoustical echo and 
other sounds in the at least partially enclosed space to a 
second composite signal. 

The analog-to-digital converter samples the pseudo- 
noise signal and the first and second composite signals 
and converts them to corresponding digital signals. The 
digital signal processor cross-correlates the second 
composite signal with the pseudonoise signal to pro- 
duce an estimate of the overall impulse response of the 
combined system formed by the at least partially en- 
closed space and the audio system. The processor then 
convolves the first composite signal with the impulse 
response estimate to produce an echo estimation signal. 
The echo estimation signal is an estimate of the compo- 
nent of the second composite signal which corresponds 
to the acoustical echo. Finally, the processor subtracts 
the echo estimation signal from the second composite 
signal to produce a digital output signal. The digital 
output signal is then converted to a corresponding ana- 
log output signal by the digital-to-analog converter for 
transmission to the other communication device. 
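
The cross-correlate, convolve, and subtract sequence summarized above can be sketched in Python with NumPy. This is an illustrative sketch under stated assumptions, not the patented implementation: the function names, the division by the pseudonoise power, and the truncation of the correlation to `n_taps` coefficients are choices made for the example. The FFT-based circular correlation follows the general method described later in the text for the cross-correlation routine.

```python
import numpy as np


def estimate_impulse_response(second_composite, pseudonoise, n_taps):
    """Estimate the overall impulse response by circular cross-correlation.

    Assumes both inputs cover the same block of samples and that the
    pseudonoise is roughly white, so that its autocorrelation is close
    to an impulse scaled by its mean power.
    """
    n = len(pseudonoise)
    spectrum = np.fft.fft(second_composite) * np.conj(np.fft.fft(pseudonoise))
    correlation = np.real(np.fft.ifft(spectrum)) / n
    # Dividing by the pseudonoise power (an assumption of this sketch)
    # makes the estimate independent of the pseudonoise level.
    return correlation[:n_taps] / np.mean(pseudonoise ** 2)


def cancel_echo(second_composite, first_composite, h_est):
    """Convolve the first composite signal with the estimate and subtract."""
    echo_estimate = np.convolve(first_composite, h_est)[: len(second_composite)]
    return second_composite - echo_estimate
```

When an input signal from the other device is mixed with the pseudonoise, that input acts as additional noise in the correlation, so longer correlation blocks give a cleaner estimate.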

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects of the invention will be more clearly
understood from the following detailed description and appended
claims when read in connection with the following drawings, in which:

FIG. 1 shows a block diagram of an audio network having a near-end
communication device in accordance with the present invention;

FIG. 2 shows a detailed illustration of the near-end communication
device of FIG. 1 including the near-end echo canceler and an
associated near-end audio system;

FIG. 3 shows a detailed illustration of the digital signal processor
of the near-end echo canceler shown in FIG. 2;

FIG. 4 shows a flow diagram of the adaptive filter and
cross-correlation routines of the digital signal processor shown in
FIG. 3; and

FIG. 5 shows another embodiment of an audio network in accordance
with the present invention.

DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

FIGS. 1-5 provide illustrations of the invention disclosed herein. In
these figures, like components are designated by like numerals.

Referring initially to FIG. 1, there is shown a block diagram of a
full-duplex audio network 10 which connects an at least partially
enclosed near-end space 12 and an at least partially enclosed far-end
space 14. The at least partially enclosed spaces 12 and 14 may each
be (1) a room in a building or home, (2) an auditorium, (3) a meeting
room, (4) a passenger compartment of a car, (5) a classroom, (6) a
teleconferencing room, or (7) some other at least partially enclosed
structure.

The audio network 10 of FIG. 1 includes a near-end communication
device 16 located in the near-end space 12 and a far-end
communication device 18 located in the far-end space 14. The near-end
and far-end communication devices 16 and 18 may each be (1) a
speakerphone for a room or a car, (2) a public address (PA) system
for a meeting room, auditorium, or classroom, (3) audio/visual
equipment for a video classroom or a teleconferencing room, (4) a
telephone, or (5) some other communication device having an audio
system.

For the audio network 10 shown in FIG. 1, the far-end communication
device 18 includes a far-end audio system 20 but not an echo
canceler. The far-end audio system 20 may be of the type commonly
found in any of the communication devices which were described
earlier as being suitable for the far-end communication device 18.

As is described later in detail, the microphone 22 of the far-end
audio system 20 detects the pressure waves of the far-end user’s
speech, s1(t), the far-end background noise, n1(t), and a far-end
echo, e1(t). The far-end microphone 22 converts the detected pressure
waves of these sounds to an analog composite audio signal, v1(t).

The composite audio signal, v1(t), is then amplified and/or filtered
with the conventional audio electronics 26 of the far-end audio
system 20 to produce the analog far-end composite output audio
signal, v2(t). This is done so that the far-end output audio signal,
v2(t), can be properly transmitted by the conventional transport
system 28.

The far-end output audio signal, v2(t), is then transmitted by the
transport system 28 to the near-end communication device 16. This
transmitted signal, v2(t), suffers only a minor amount of distortion
during the transmission process and is received by the near-end
communication device 16 as the analog near-end audio input signal,
y1(t). The transport system 28 may be (1) a satellite transmission
system, (2) a microwave transmission system, (3) a cellular
transmission system, (4) a fiber optic transmission system, (5) a
wire transmission system, or (6) some other similar transmission
system.

FIG. 2 provides a more detailed illustration of the near-end
communication device 16. It includes a near-end echo canceler 30 and
a near-end audio system 32.

The spread spectrum pseudonoise signal generator 34 of the near-end
echo canceler 30 generates a spread spectrum pseudonoise signal,
c(t), in analog format. The spread spectrum signal, c(t), can be
generated using either random binary pulse code modulation (PCM) or
binary phase shift keying (BPSK). This signal, c(t), can have a code
length approximately in the range of 4,000-300,000 chips and can be
generated at a code rate approximately in the range of 8,000-64,000
chips per second. In addition, this signal, c(t), can be generated as
simply a baseband signal or can be generated with a carrier
component. Moreover, the amplitude of the spread spectrum signal,
c(t), can be selected to be approximately in the range 10-25 dB below
the near-end input audio signal (i.e., the received far-end output
audio signal), y1(t).

In the preferred embodiment, the spread spectrum signal, c(t), has a
code length of 262,143 chips and is generated as a baseband signal at
a code rate of 8,000 chips per second. Also, in the preferred
embodiment, the amplitude of the signal, c(t), is selected to be
approximately 15 dB below the near-end input audio signal, y1(t).

The audio mixer 36 of the echo canceler 30 receives the spread
spectrum signal, c(t), and the near-end input audio signal (i.e., the
received far-end output audio signal), y1(t). The audio mixer 36 is
of a conventional type and combines these signals, c(t) and y1(t), to
produce the analog composite audio signal, y2(t)=y1(t)+c(t). This
signal, y2(t), is the input to the near-end audio system 32.

The near-end audio system 32 may be of the type commonly found in any
of the communication devices which were described earlier as being
suitable for the near-end communication device 16. However, the
preferred embodiment of the audio system 32 is illustrated in FIG. 2.

As shown in FIG. 2, the near-end audio system 32 includes the
conventional near-end audio electronics 38. The graphic equalizer 40
of the near-end audio electronics 38 receives the composite input
audio signal, y2(t), from the audio mixer 36 of the echo canceler 30.
The graphic equalizer 40 is of a conventional type and is used to
filter the composite input audio signal, y2(t), so that it can be
properly converted to sound by the loudspeaker 42 of the audio system
32. In the preferred embodiment, the graphic equalizer 40 filters out
frequencies of this signal, y2(t), which are not in the range of
200-4,000 Hz.

The audio power amplifier 44 of the near-end audio electronics 38
receives the filtered composite audio signal, y3(t), from the graphic
equalizer 40. The audio power amplifier 44 is of a conventional type
and is used to drive the low impedance load of the loudspeaker 42. In
other words, the audio power amplifier 44 amplifies the filtered
composite audio signal, y3(t), to a level at
which it can be properly converted to sound by the near-end
loudspeaker 42.

The near-end loudspeaker 42 receives the amplified and filtered
composite audio signal, y4(t), from the audio amplifier 44. The
loudspeaker 42 is of a conventional type and converts this signal,
y4(t), to sound in the at least partially enclosed near-end space 12.

The near-end space 12 has a characteristic impulse response, g2(t).
When the filtered and amplified composite audio signal, y4(t), is
converted to sound, the near-end space 12 produces an acoustical echo
or reverberation, e2(t)=y4(t) * g2(t), in response. The echo, e2(t),
is due to the impulse response, g2(t), of the near-end space 12 and
is the convolution of the signal, y4(t), with the impulse response,
g2(t).

The microphone 46 of the near-end audio system 32 detects the
pressure waves of the echo, e2(t), the near-end user’s speech, s2(t),
and the background noise, n2(t), in the near-end space 12. The
microphone 46 converts the detected pressure waves of these sounds to
an analog near-end composite audio signal, y5(t). Thus, the composite
audio signal, y5(t), has (1) a speech component which corresponds to
the near-end speech, s2(t), (2) a noise component which corresponds
to the background noise, n2(t), and (3) an echo component which
corresponds to the acoustical echo, e2(t).

From the microphone 46, the composite audio signal, y5(t), is
provided to the conventional audio preamplifier 48 of the near-end
audio electronics 38. The audio preamplifier 48 amplifies the
composite audio signal, y5(t), with minimal introduction of noise, to
a level at which it can be properly processed by the echo canceler 30
and can be properly transmitted by the transport system 28.

The audio line amplifier 49 of the near-end electronics 38 receives
the preamplified composite audio signal, y6(t). The audio line
amplifier 49 is of a conventional type and amplifies the composite
audio signal, y6(t), so as to drive the low impedance loads of the
echo canceler 30 and the transport system 28. This amplified analog
composite audio signal, y7(t), is the output of the near-end audio
system 32.

Thus, the near-end audio system 32 has two general functions. First,
it converts the composite input audio signal, y2(t), to sound in the
near-end space 12. Second, it converts the acoustical echo, e2(t),
the near-end speech, s2(t), and the near-end noise, n2(t), to the
composite output audio signal, y7(t). In view of this, it is clear
that the near-end audio system 32 and the near-end space 12 together
form a combined system that has an overall impulse response, h2(t).
Therefore, the composite output audio signal, y7(t), is the result of
the convolution of the composite input signal, y2(t), with the
overall impulse response, h2(t).

As is the case with the composite audio signal, y5(t), the composite
output audio signal, y7(t), of the audio system 32 has (1) a speech
component which corresponds to the near-end speech, s2(t), (2) a
noise component which corresponds to the background noise, n2(t), and
(3) an echo component which corresponds to the acoustical echo,
e2(t). Furthermore, the echo component, y2(t) * h2(t), of the
composite output audio signal, y7(t), is the convolution of the
near-end composite audio input signal, y2(t), with the overall
impulse response, h2(t).

The analog to digital (A/D) converter 50 of the echo canceler 30
receives (1) the analog composite output audio signal, y7(t), from
the near-end audio system 32, (2) the analog spread spectrum signal,
c(t), from the signal generator 34 and (3) the analog composite input
audio signal, y2(t), from the audio mixer 36. The A/D converter 50 is
of a conventional type and converts these analog signals, y7(t),
c(t), and y2(t), to the corresponding digital signals, y7(n), c(n),
and y2(n). As a result, the A/D converter 50 outputs each of these
digital signals, y7(n), c(n), and y2(n), to the digital signal
processor (DSP) 52 as a sequence of discrete samples at a sampling
rate approximately in the range of 8-64 kHz.

In the preferred embodiment, the A/D converter 50 has a sampling rate
of 8 kHz. This results in the samples of the digital signals, y7(n),
c(n), and y2(n), being received by the DSP 52 at a rate of 8 kHz.

FIG. 3 provides a more detailed illustration of the DSP 52 of the
echo canceler 30. It includes a central processing unit (CPU) 54 and
a memory 56. The memory 56 stores an adaptive filter routine 58 and a
cross-correlation routine 60 which are run simultaneously by the CPU
54 for echo cancellation purposes.

FIG. 4 provides an illustration of the operation of the DSP 52 in
terms of the functions of the adaptive filter routine 58 and the
cross-correlation routine 60. As shown in FIG. 4, steps 62-76 pertain
to the functions of the adaptive filter routine 58 while the steps
78-84 pertain to the functions of the cross-correlation routine 60.

The echo component, y2(n) * h2(n), of the digital near-end composite
output audio signal, y7(n), is the convolution of the digital
near-end composite audio input signal, y2(n), with the discrete time
overall impulse response, h2(n), of the combined system formed by the
near-end space 12 and the near-end audio system 32. In general terms,
the cross-correlation routine 60 provides a method for
cross-correlating the near-end composite output audio signal, y7(n),
with the digital spread spectrum code signal, c(n), to produce a
first estimate, ĥ1n, of the overall impulse response, h2(n). And, the
adaptive filter routine 58 provides a least-mean-squares type method
for generating a second estimate, ĥ2n, of the overall impulse
response, h2(n).

If near-end speech, s2(t), is present, the adaptive filter routine 58
will convolve the digital near-end composite input audio signal,
y2(n), with the first impulse response estimate, ĥ1n, generated by
the cross-correlation routine 60. However, if there is no near-end
speech, s2(t), then the adaptive filter routine 58 will convolve the
signal, y2(n), with the second impulse response estimate, ĥ2n, it has
generated. In either case, the signal resulting from the convolution
is subtracted from the near-end audio output signal, y7(n), to remove
the earlier described echo component, y2(n) * h2(n).

More specifically, for the adaptive filter routine 58, the first step
62 is to read in the next samples of (1) the digital near-end
composite input audio signal, y2(n), (2) the digital near-end
composite output audio signal, y7(n), and (3) the digital spread
spectrum signal, c(n). The samples for these signals, y2(n), y7(n),
and c(n), are all provided by the A/D converter 50.

Once these samples are read in, the next step 64 is for the adaptive
routine 58 to generate (1) a spread spectrum sample vector, C1n, for
the spread spectrum signal, c(n), and (2) a composite input sample
vector, Y2n, for the near-end composite input audio signal, y2(n).
The spread spectrum sample vector, C1n, for the spread spectrum
signal, c(n), has a length, N1, and contains the N1 most recently
read-in samples of the spread spectrum signal, c(n), including the
sample read in during
step 62. The composite input sample vector, Y2n, has a length, N2,
and contains in reverse discrete time order the N2 most recently
read-in samples of the near-end composite input audio signal, y2(n),
including the sample read in during step 62.

The length, N1, of the spread spectrum sample vector, C1n, is chosen
such that the adaptive filter routine 58 can quickly detect the onset
of near-end speech, s2(t). This length, N1, can be approximately in
the range of 5-50 samples. In the preferred embodiment, the length N1
of the sample vector, C1n, is six.

The length, N2, of the composite input sample vector, Y2n, is equal
to the length of the estimated impulse responses, ĥ1n and ĥ2n. This
length, N2, can be approximately in the range of 300-2,000 samples
and is chosen to be sufficient for high fidelity modeling of the
overall impulse response, h2(n). However, in the preferred embodiment
the length, N2, is 400 samples.

In step 66, the adaptive filter routine 58 computes a decision value,
d(n). This decision value, d(n), is computed by first computing the
inner product of the spread spectrum sample vector, C1n, and the
output sample vector, Y9n-1, computed during the previous loop in
step 70. This inner product is then divided by the inner product of
the previous output sample vector, Y9n-1, with itself. As will be
discussed later with respect to steps 68 and 72, the decision value,
d(n), is high when near-end speech, s2(t), is not present and is low
when near-end speech, s2(t), is present.

In the following step 68, the adaptive filter routine 58 generates a
digital echo estimation signal, y8(n). As was indicated earlier, the
echo component of the analog signal, y7(t), is the convolution of the
analog near-end composite audio input signal, y2(t), with the overall
impulse response, h2(t). The digital echo estimation signal, y8(n),
is therefore used to estimate the echo component of the corresponding
digital signal, y7(n), and is computed in the following manner.

As was suggested earlier and as will be discussed later with respect
to step 72, if near-end speech is not present, then the decision
value, d(n), is high. If in this case the decision value exceeds a
predefined threshold value, T, the adaptive filter routine 58
generates the echo estimation signal y8(n) by computing the inner
product of the composite input sample vector, Y2n, and the impulse
response estimate, ĥ2n-1, generated by the adaptive filter routine 58
in step 76 during the previous loop. Since the samples of the
composite input sample vector, Y2n, are in reversed time order, this
inner product is the convolution of the composite input audio signal,
y2(n), with the impulse response estimate, ĥ2n-1.

On the other hand, if near-end speech, s2(t), is present, the
decision value d(n) is low. This will be described in detail with
respect to step 72. If the decision value d(n) does not exceed the
threshold value, T, the adaptive filter routine 58 generates the echo
estimation signal y8(n) by computing the inner product of the
composite input sample vector, Y2n, and the impulse response
estimate, ĥ1n, generated by the cross-correlation routine 60 in step
84. This inner product is the convolution of the composite input
audio signal, y2(n), with the impulse response estimate, ĥ1n.

In step 70, the adaptive filter routine 58 generates (1) a digital
near-end output audio signal, y9(n), and (2) a near-end output sample
vector, Y9n. The output audio signal, y9(n), is produced by
subtracting the echo estimation signal, y8(n), generated in step 68
from the near-end composite output audio signal, y7(n), provided by
the A/D converter 50. The near-end output sample vector, Y9n, has a
length equal to the length, N1, of the spread spectrum sample vector,
C1n, and contains the N1 most recent samples of the near-end output
audio signal, y9(n), generated by the adaptive filter routine 58 in
step 70.

Because of the subtraction performed in step 70, substantially all of
the earlier described echo component, y2(n) * h2(n), of the near-end
composite output audio signal, y7(n), is canceled from this signal,
y7(n). The result is the substantially echoless digital near-end
output audio signal y9(n). As shown in FIG. 2, this signal, y9(n), is
the output of the DSP 52 and is provided to the digital to analog
(D/A) converter 86.

Step 72 is a decision step for the adaptive filter routine 58. In
this step 72, the adaptive filter routine 58 determines whether or
not the decision value, d(n), computed in step 66 exceeds the
threshold value, T.

As was indicated earlier, the decision value, d(n), will not exceed
the threshold value, T, when near-end speech, s2(t), is present. This
can be explained as follows. When near-end speech, s2(t), is present,
the near-end composite output audio signal, y7(n), will grow in
amplitude. This occurs because the signal, y7(n), will have a
component corresponding to the near-end speech, s2(t). Therefore,
this signal, y7(n), will be substantially different from the
composite input audio signal, y2(n). Then, when these two signals,
y7(n) and y2(n), are subtracted in step 70, the result will be a
large near-end output audio signal, y9(n). Thus, the most recent
samples of the near-end output sample vector, Y9n, computed in step
70 will be relatively large in amplitude. As a result, a small value
for the decision value, d(n), will have been computed in step 66
because of the large value of the inner product of the output sample
vector, Y9n, with itself compared to the smaller value of the inner
product of the output sample vector, Y9n, with the spread spectrum
sample vector, C1n.

And, as was also suggested earlier, when near-end speech, s2(t), is
absent, the decision value, d(n), will exceed the threshold value, T.
This occurs because the near-end composite output audio signal,
y7(n), no longer has a component corresponding to near-end speech,
s2(t). Therefore, this signal, y7(n), will not be substantially
different from the composite input audio signal, y2(n). Furthermore,
when these two signals, y7(n) and y2(n), are subtracted in step 70,
the result will be a small near-end output audio signal, y9(n). Thus,
the most recent samples of the near-end output sample vector, Y9n,
computed in step 70 will be relatively small in amplitude. As a
result, a large value for the decision value, d(n), will have been
computed in step 66 because of the small value of the inner product
of the output sample vector, Y9n, with itself compared to the
comparable value of the inner product of the output sample vector,
Y9n, with the spread spectrum sample vector, C1n.

If the decision value, d(n), computed in step 66 does not exceed the
threshold value, T, then the adaptive filter 58 bypasses the steps 74
and 76 and begins a new loop at the step 62. Thus, the adaptive
filter routine 58 continues to use the impulse response estimate,
ĥ1n, of the cross-correlation routine 60 to compute the echo
estimation signal, y8(n), in step 68 until the decision value, d(n),
computed in step 66 exceeds the threshold value, T.

If the decision value, d(n), computed in step 66 does exceed the
threshold value, T, then the adaptive filter
58 performs steps 74 and 76 before beginning a new loop at step 62.
Thus, the adaptive filter routine 58 continues to use the impulse
response estimate, ĥ2n-1, of the cross-correlation routine 60 to
compute the echo estimation signal, y8(n), in step 68 until the
decision value, d(n), computed in step 66 does not exceed the
threshold value, T.

The threshold value, T, is selected so that the decision value, d(n),
will exceed it when near-end speech, s2(t), is absent and will not
exceed it when near-end speech, s2(t), is present. The threshold
value, T, is 0.5 in the preferred embodiment.

In the following step 74, the adaptive filter routine 58 computes an
estimate of the component, Zn, of the composite input sample vector,
Y2n, which is orthogonal to the composite input sample vector, Y2n-1,
of the previous loop. This orthogonal component vector, Zn, has a
length equal to the length, N2, of the vector, Y2n.

This is done by first computing the inner product of the composite
input sample vector, Y2n, with the composite input sample vector,
Y2n-1, from the previous loop. This inner product is then divided by
the inner product of the vector, Y2n-1, with itself. Then the product
of this ratio and the vector, Y2n-1, is subtracted from the vector,
Y2n, to produce the orthogonal vector, Zn.

As mentioned earlier, the length, N2, of the orthogonal component
vector, Zn, can be approximately in the range of 300-2,000 samples.
And, in the preferred embodiment the length, N2, is 400 samples.

The next step 76 of the adaptive routine 58 is to compute an updated
least-squares estimate, ĥ2n, of the discrete time overall impulse
response, h2(n), of the combined system formed by the near-end space
12 and the near-end audio system 32. This is done by first computing
the product of the decision value, d(n), and the near-end output
audio signal, y9(n). This product is then divided by the inner
product of the composite input

as does the spread spectrum sample vector, C2n, and contains the N3
most recently read-in samples of the near-end composite output audio
signal, y7(n). This length, N3, can be approximately in the range of
4,000-300,000 samples and is chosen to be sufficient for proper
cross-correlation of the near-end composite output audio signal,
y7(n), and the spread spectrum signal, c(n).

In step 80, the cross-correlation routine 60 computes the product, H,
of the Fast Fourier Transform (FFT) of the near-end composite output
sample vector, Y7n, and the complex conjugate of the FFT of the
spread spectrum sample vector, C2n. This produces the FFT of the
impulse response estimate, ĥ3n, computed in the next step 82.

In step 82, as was just suggested, the cross-correlation routine 60
computes the impulse response estimate, ĥ3n. This is accomplished by
computing the real portion of the Inverse Fast Fourier Transform
(IFFT) of the product, H, computed in step 80 and dividing it by the
length, N3, of the spread spectrum vector, C2n.

The impulse response estimate, ĥ3n, has a length, N2. As was
mentioned earlier, this length, N2, can be approximately in the range
of 300-2,000 samples and is chosen to be sufficient for high fidelity
modeling of the overall impulse response, h2(n). As was also
mentioned, in the preferred embodiment the length, N2, is 400
samples.

The computation in step 82 together with that in step 80 provides a
circular cross-correlation of the near-end composite output audio
signal, y7(n), and the spread spectrum signal, c(n). The result is
the accurate impulse response estimate, ĥ3n, during near-end speech,
s2(t).

The just described FFT/IFFT method of computing an estimate of the
cross-correlation of the signals, y7(n) and c(n), is just one
possible way to make the computation efficient. Another way would be
to compute the time average product of the composite output sample

sample vector, Y2n, and the orthogonal component vector, Y7n, with the spread spectrum sample vector, 

vector, Zn. The resulting scaler, is multiplied by the 40 C2n. 

vector, Zn, to produce a correction vector. This correc- Step 84 of the cross-correlation routine 60 makes the 
tion vector is then added to the least-mean-square im- impulse response estimate, h3n, even more accurate, 

pulse response estimate, h2n-l, from the previous loop This is accomplished by averaging the current estimate, 

to produce the updated least-mean-square impulse re- E3n, along with the 100 most recent estimates computed 

sponse estimate, h2n. 45 in step 82 to form the impulse response estimate, tin. 

As was mentioned earlier, the updated impulse re- Like the impulse response estimate, h3n, the averaged 
sponse estimate, hln, has a length, N2. And, as was also impulse response estimate, hln, has a length, N2. As 

mentioned earlier, this length, N2, can be approximately was mentioned earlier, this length, N2, can be approxi- 

in the range of 300-2,000 samples and is chosen to be mately in the range of 300-2,000 samples and in the 
sufficient for high fidelity modeling of the overall im- 50 preferred embodiment the length, N2, is 400 samples, 
pulse response, h2(n). In the preferred embodiment the The averaged impulse response estimate, filn, is con- 
length, N2, is 400 samples. tinuously available to the adaptive filter routine 58 but 

Upon completion of step 76, the adaptive filter rou- used only when the decision value d(n) does not exceed 

tine 58 returns to step 62 to begin a new loop. However, the threshold value, T. Upon completion of step 84, the 

at the same time that the adaptive filter routine 58 is 55 cross-correlation routine 60 returns to step 78 to begin 
running the cross-correlation routine 60 is also running. the next loop. 

The first step 78 of the cross-correlation routine 50 is Returning to FIG. 1, the digital near-end output 
to (1) generate a near-end composite output sample audio signal, y9(n), produced by the adaptive filter 

vector, Y7n, for the near-end composite output audio routine 58 is then outputted by the DSP 52 to the digi- 

signal, y7(n), and (2) generate a second spread spectrum 60 tal-to-analog converter (D/A) 86. The D/A converter 
sample vector, C2n, for the spread spectrum signal, 86 is of a conventional type and converts the digital 

c(n). The near-end composite output signal, y7(n), and near-end output audio signal, y9(n), to its corresponding 

the spread spectrum signal, c(n), are read in by the analog signal, y9(t). 

adaptive filter routine 58 in step 62. The analog near-end output audio signal, y9(t) is then 

The spread spectrum sample vector, C2n, has a 65 transmitted by the transport system 28 to the far-end 
length, N3, and contains the N3 most recently read-in communication device 18. This transmitted signal, 

samples of the spread spectrum signal, c(n). The near- y9(t), suffers only a minor amount of distortion during 

end output sample vector, Y7n, has the same length, N3, the transmission process and is received by the far-end 
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communication device 18 as the analog far-end input 
audio signal, v3(t). 

The far-end input audio signal (i.e. the near-end out- 
put audio signal), v3(t), is then filtered and/or amplified 
with the conventional audio electronics 26 of the far- 
end audio system 20 to produce the far-end audio signal, 
v4(t). This is done so that the audio signal, v4(t), can be 
properly converted to sound by the loudspeaker 88 of 
the far-end audio system 20. 
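
As an illustration of the cross-correlation of steps 80 and 82 described earlier, the following Python sketch estimates a short impulse response from a pseudonoise signal and then subtracts the resulting echo estimate. It is a simplified sketch under stated assumptions, not the patented implementation: the composite signal is taken to be the pseudonoise alone (no far-end or near-end speech), the echo is modeled as a circular convolution, a naive DFT stands in for the FFT, and the names pseudonoise, dft, estimate_impulse_response, circular_convolve, and the 4-tap response true_h are hypothetical, chosen only for this example.

```python
import cmath


def pseudonoise(length=127):
    """Hypothetical +/-1 pseudonoise source: maximal-length sequence
    generated by the recurrence a[n] = a[n-6] XOR a[n-7] (period 127)."""
    bits = [1, 0, 0, 0, 0, 0, 0]
    while len(bits) < length:
        bits.append(bits[-6] ^ bits[-7])
    return [1.0 if b else -1.0 for b in bits[:length]]


def dft(x, inverse=False):
    """Naive O(N^2) DFT standing in for the FFT; unscaled in both directions."""
    n = len(x)
    sign = 1 if inverse else -1
    return [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def estimate_impulse_response(y7, c, n2):
    """Circular cross-correlation of y7 with c, kept to the first n2 taps."""
    n3 = len(c)
    # Step 80: product of the transform of y7 and the conjugate transform of c.
    h_prod = [a * b.conjugate() for a, b in zip(dft(y7), dft(c))]
    # Step 82: real part of the inverse transform (scaled by 1/n3 here because
    # dft() is unscaled), then divided by the length n3.
    corr = [v.real / n3 for v in dft(h_prod, inverse=True)]
    return [v / n3 for v in corr[:n2]]


def circular_convolve(x, h):
    """Circular convolution, used here as an assumed model of the echo path."""
    n = len(x)
    return [sum(h[j] * x[(t - j) % n] for j in range(len(h))) for t in range(n)]


c = pseudonoise()                      # spread spectrum signal c(n)
true_h = [0.5, -0.3, 0.2, 0.1]         # assumed overall impulse response
y7 = circular_convolve(c, true_h)      # composite output containing only echo
est = estimate_impulse_response(y7, c, n2=len(true_h))
y8 = circular_convolve(c, est)         # echo estimation signal
residual = [a - b for a, b in zip(y7, y8)]  # output after echo subtraction
print([round(v, 3) for v in est])
```

Because a maximal-length sequence has a nearly impulse-like circular autocorrelation, the estimate in this noise-free setting differs from the assumed response only by a small bias on the order of 1/N3. With speech present, the averaging of step 84 would be needed to suppress the additional correlation noise.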


The far-end loudspeaker 88 receives the audio signal, v4(t), from the audio electronics 26. The loudspeaker 88 is of a conventional type and converts this signal, v4(t), to sound. As a result, the far-end echo, e1(t)=v4(t) * g1(t), is produced in the far-end space 14 due to the characteristic impulse response, g1(t), of the far-end space 14. The echo, e1(t), is the convolution of the signal, v4(t), with the impulse response, g1(t).

As was described earlier, the microphone 22 detects the pressure waves of the far-end speech, s1(t), and the far-end background noise, n1(t), and the far-end echo, e1(t). Therefore, when the composite input audio signal, y2(t), is converted to sound by the near-end audio system 32, the near-end user will hear far-end speech and background noise distorted by the far-end echo.

In order to alleviate this problem, a far-end echo canceler 90 can be added to the far-end communication device 18, as shown in FIG. 5. The echo canceler 90 is configured like the echo canceler 30 described earlier. It removes the echo component of the far-end composite output audio signal, v2(t), which corresponds to the acoustical echo, e1(t). This echo component is due to the overall impulse response, h1(t), of the combined system formed by the far-end space 14 and the far-end audio system 20. Thus, the echo canceler 90 uses the same method discussed earlier with respect to echo canceler 30 for generating an estimate of the discrete time overall impulse response, h1(t). This estimate is then used by the echo canceler 90 in order to remove the echo component of the far-end composite output audio signal, v2(t), and produce in response the far-end output signal, v5(t).

If either of the far-end or near-end communication devices 16 or 18 is a telephone, the far-end user’s head is for the most part coupled to the loudspeaker 88 so as to inhibit an echo, e1(t), from being produced in the far-end space 14. As a result, the microphone 22 detects the pressure waves of the far-end speech, s1(t), and the far-end background noise, n1(t), and only a small amount of echo, e1(t), if at all. As a result, when the composite input audio signal, y2(t), is converted to sound by the near-end audio system 32, the near-end user will hear far-end speech and far-end background noise but will hear very little far-end echo. Thus, in the case where the communication device 16 or 18 is a telephone, the addition of the far-end echo canceler 30 or 90 will not have as drastic an effect in improving performance as is the case for other types of communication devices. Nevertheless, the described echo canceler 30 or 90 and associated method may be used with a telephone.

The earlier described communication device 16 or 18, near-end echo canceler 30 or 90, and associated method provide several significant advantages over the prior art. These advantages are evident from the earlier description of the communication device 16, the echo canceler 30, and associated method.

In particular, the cross-correlation produced by the cross-correlation routine 60 enables the echo canceler and adaptive filter routine 58 to generate an echo estimation signal, y8(n), which does not result in the suppression of the component of the audio output signal, y7(t), which corresponds to the near-end speech, s2(t). In addition, unlike the prior art, the described method for switching between the impulse response estimates, ĥ1n and ĥ2n, allows for duplex conversation (i.e. double talk). And lastly, the combination of the echo canceler and adaptive filter routine 58 and the cross-correlation routine 60 allows for quick adaptation to changes in the actual overall impulse response, h(t), of the combined system formed by the near-end space 12 and the near-end audio system 32.

Numerous other alternatives exist for the audio network. For example, in FIG. 2, rather than having the near-end echo canceler 30 and the near-end audio electronics 38 located within the near-end space 12, both or portions of both may be located external to the near-end space 12. Furthermore, the near-end audio system 32 may include multiple loudspeakers 42 and/or multiple microphones 46. In the event that the audio system 32 includes multiple microphones 46, the audio preamplifier 43 will be replaced by a conventional audio mixer which includes a preamplifier and which is coupled to each of the multiple microphones 46.

While the present invention has been described with reference to a few specific embodiments, the description is illustrative of the invention and is not to be construed as limiting the invention. Various modifications may occur to those skilled in the art without departing from the true spirit and scope of the invention as defined by the appended claims.

What is claimed is:

1. A communication device comprising:

a signal generator for generating a pseudonoise signal;

an audio mixer responsive to an input audio signal and said pseudonoise signal for combining said input audio signal with said pseudonoise signal to produce a first composite signal;

an audio system responsive to said first composite signal for converting said first composite signal to sound in an at least partially enclosed space, said at least partially enclosed space producing an acoustical echo in response, said audio system also for converting said acoustical echo and other sounds in said at least partially enclosed space to a second composite signal, said second composite signal including an echo component corresponding to said acoustical echo, said at least partially enclosed space and said audio system together forming a combined system having an overall impulse response;

means responsive to said second composite signal and said pseudonoise signal for generating a first estimate of said overall impulse response;

means responsive to said first composite signal and said first estimate for generating an echo estimation signal corresponding to an estimate of said echo component of said second composite signal; and

means for subtracting said echo estimation signal from said second composite signal to produce an output audio signal.

2. The device of claim 1 wherein said means for generating said first estimate includes means for cross-correlating said second composite signal and said pseudonoise signal.
3. The device of claim 2 wherein said means for generating said echo estimation signal includes means for convolving said first composite signal with said first estimate.

4. The device of claim 1 further comprising:

means responsive to said output audio signal and said pseudonoise signal for computing a decision value;

wherein said means for generating said echo estimation signal includes first means responsive to said decision value and to said first estimate for producing said echo estimation signal based on said first estimate only when said decision value does not exceed a predefined threshold value.

5. The device of claim 4 further comprising:

means responsive to said first composite signal and said output signal for generating a second estimate of said overall impulse response;

wherein said means for generating said echo estimation signal includes second means responsive to said decision value and to said second estimate for producing said echo estimation signal based on said second estimate only when said decision value exceeds said threshold value.

6. The device of claim 5 wherein:

said means for generating said first estimate includes means for cross-correlating said second composite signal and said pseudonoise signal; and

said first means includes means for convolving said first composite signal with said first estimate;

said second means includes means for convolving said first composite signal with said second estimate.

7. An acoustical echo cancellation apparatus for use with an audio system, said audio system responsive to a first composite signal for converting said first composite signal to sound in an at least partially enclosed space, said at least partially enclosed space producing an acoustical echo in response, said audio system also for converting said acoustical echo and other sounds in said at least partially enclosed space to a second composite signal, said second composite signal including an echo component corresponding to said acoustical echo, said at least partially enclosed space and said audio system together forming a combined system having an overall impulse response, said apparatus comprising:

a signal generator for generating a pseudonoise signal;

an audio mixer responsive to an input audio signal and said pseudonoise signal for combining said input audio signal with said pseudonoise signal to produce said first composite signal;

means responsive to said second composite signal and said pseudonoise signal for generating a first estimate of said overall impulse response;

means responsive to said first composite signal and said first estimate for generating an echo estimation signal corresponding to an estimate of said echo component of said second composite signal; and

means for subtracting said echo estimation signal from said second composite signal to produce said output audio signal.

8. The apparatus of claim 7 wherein said means for generating said first estimate cross-correlates said second composite signal and said pseudonoise signal for generating said first estimate.

9. The apparatus of claim 8 wherein said means for generating said echo estimation signal includes means for convolving said first composite signal with said first estimate.

10. The apparatus of claim 7 further comprising:

means responsive to said output audio signal and said pseudonoise signal for computing a decision value;

wherein said means for generating said echo estimation signal includes first means responsive to said decision value and to said first estimate for producing said echo estimation signal based on said first estimate only when said decision value does not exceed a predefined threshold value.

11. The apparatus of claim 10 wherein said first means includes means for convolving said first composite signal with said first estimate.

12. The apparatus of claim 11 wherein said means for generating said first estimate includes means for cross-correlating said second composite signal and said pseudonoise signal.

13. The apparatus of claim 10 further comprising:

means responsive to said first composite signal and said output signal for generating a second estimate of said overall impulse response;

wherein said means for generating said echo estimation signal includes second means responsive to said decision value and to said second estimate for producing said echo estimation signal based on said second estimate only when said decision value exceeds said threshold value.

14. The apparatus of claim 13 wherein said second means includes means for convolving said first composite signal with said second estimate.

15. The apparatus of claim 14 wherein:

said means for generating said first estimate includes means for cross-correlating said second composite signal and said pseudonoise signal; and

said first means includes means for convolving said first composite signal with said first estimate.

16. A method of acoustical echo cancellation for use with an audio system, said audio system responsive to a first composite signal for converting said first composite signal to sound in an at least partially enclosed space, said at least partially enclosed space producing an acoustical echo in response, said audio system also for converting said acoustical echo and other sounds in said at least partially enclosed space to a second composite signal, said second composite signal including an echo component corresponding to said acoustical echo, said at least partially enclosed space and said audio system together forming a combined system having an overall characteristic impulse response, said method comprising the steps of:

generating a pseudonoise signal;

combining an input audio signal with said pseudonoise signal to form said first composite signal;

generating a first estimate of said overall impulse response in response to said second composite signal and said pseudonoise signal;

generating an echo estimation signal corresponding to an estimate of said echo component in response to said first composite signal and said first estimate; and

subtracting said echo estimation signal from said second composite signal to produce an output signal.

17. The method of claim 16 wherein said step of generating said first estimate includes the step of cross-correlating said second composite signal and said pseudo-random noise signal.

18. The method of claim 17 further comprising the
steps of: 

computing a decision value in response to said output 
signal and said pseudonoise signal; 

wherein said step of generating said echo estimation 
signal includes the step of producing said echo 
estimation signal based on said first estimate only 
when said decision value does not exceed a prede- 
fined threshold value. 

19. The method of claim 18 further comprising the 
step of: 

generating a second estimate of said impulse response 
in response to said first composite signal and said 
output signal; 

wherein said step of generating said echo estimation 
signal includes the step of producing said echo 
estimation signal based on said second estimate 
only when said decision value exceeds said threshold
value.

20. The method of claim 18 wherein said step of pro- 
ducing said echo estimation signal based on said first 
estimate includes the step of convolving said first com-
posite signal with said first estimate. 

21. The method of claim 20 wherein said step of gen- 
erating said first estimate includes the step of cross-correlating
said second composite signal and said pseudonoise signal.

22. The method of claim 21 wherein said step of pro- 
ducing said echo estimation signal based on said second 
estimate includes the step of convolving said second
composite signal with said second estimate.

23. The method of claim 22 wherein: 

said step of generating said first estimate includes the 
step of cross-correlating said second composite 
signal and said pseudonoise signal; and 
said step of producing said echo estimation signal
based on said first estimate includes the step of 
convolving said first composite signal with said 
first estimate. 

24. The method of claim 17 wherein said step of gen- 
erating said echo estimation signal includes the step of 
convolving said first composite signal with said first 
estimate. 